ABSTRACT

Given an arbitrary point $(x,u)$ in $R^n \times R^m_+$, we give bounds on the Euclidean
distance between $x$ and the unique solution $\bar{x}$ to a strongly convex program in terms of
the violations of the Karush-Kuhn-Tucker conditions by the arbitrary point $(x,u)$. These
bounds are then used to derive linearly and superlinearly convergent iterative schemes
for obtaining the unique least 2-norm solution of a linear program. These schemes can
be used effectively in conjunction with the successive overrelaxation (SOR) methods for
solving very large sparse linear programs.
SIGNIFICANCE AND EXPLANATION

We  derive  bounds  on  the  distance  between  an  arbitrary  point  and  the  unique  solution 
of  a  strongly  convex  constrained  optimization  problem  in  terms  of  known  violations  of  the 
optimality  conditions  of  the  problem.  These  bounds  are  then  used  to  construct  effective 
schemes  for  finding  the  unique  smallest  solution  of  very  large  sparse  linear  programs. 
ERROR BOUNDS FOR STRONGLY CONVEX PROGRAMS AND
(SUPER) LINEARLY  CONVERGENT  ITERATIVE  SCHEMES  FOR 
THE  LEAST  2-NORM  SOLUTION  OF  LINEAR  PROGRAMS 


O. L. Mangasarian and R. De Leone


1.  Introduction 

We consider the problem

(1.1)  $\min f(x)$ subject to $x \in S := \{x \mid x \geq 0,\ g(x) \leq 0\}$

where $f: R^n \to R$ and $g: R^n \to R^m$ are differentiable and convex functions on $R^n$, $S$ is
nonempty and in addition $f$ is strongly convex on $R^n$, that is

(1.2)  $(\nabla f(y) - \nabla f(x))(y - x) \geq k \|y - x\|_2^2$

for all $x, y$ in $R^n$ and some $k > 0$, where $\|\cdot\|_2$ denotes the 2-norm. It follows immediately
that (1.1) has a unique solution $\bar{x}$ in $S$. Our purpose here is, given any $x$ in $R^n$, to
obtain a bound on the distance $\|x - \bar{x}\|_2$ in terms of the violations of the Karush-Kuhn-Tucker
conditions for (1.1) by $x$ and any nonnegative $u$ in $R^m$ (Theorem 2.2), or by $x$ and
an "optimal" $u$ chosen by solving a single linear program (Remark 2.6). The error bound
(2.7) of Theorem 2.2, which is also a Lipschitz continuity result of order $\frac{1}{2}$ (see (2.13)),
involves 3 parameters $\alpha, \beta, \gamma$ which may not be readily computable. In Theorem 2.5 we
replace these parameters by corresponding upper bounds $\alpha(x^0)$, $\beta(\hat{x},\hat{u})$, $\gamma(\hat{x},\hat{u})$ which are
readily computable from any primal feasible $x^0$ and any primal-dual feasible point $(\hat{x},\hat{u})$
which satisfies the primal Slater constraint qualification. Related Lipschitz continuity
results are given by Daniel in [3] for positive definite quadratic programs. Stronger local
Lipschitz continuity results for more general programs are given by Robinson in [17,18].

In Section 3 of the paper we turn our attention to what motivated the paper originally,
namely computing the least 2-norm solution of a linear program. Determination of the
least 2-norm solution of a linear program has been the keystone of the successive
overrelaxation (SOR) methods for solving very large sparse linear programs not solvable by
standard pivotal packages [9,10]. The first result of Section 3 is that the 2-norm $\|x\|_2$ of
any solution $x$ of a linear program bounds the Euclidean distance $\|x - \bar{x}\|_2$ between $x$ and
the least 2-norm solution $\bar{x}$ of the linear program. This inequality, $\|x - \bar{x}\|_2 \leq \|x\|_2$, which
is obviously valid for any two points $x$ and $\bar{x}$ in the nonnegative orthant $R^n_+$ if $x \geq \bar{x}$, is
not valid if we merely have $\|x\|_2 \geq \|\bar{x}\|_2$, as can be seen from a simple example in $R^2$
such as $x = (1,0)^T$, $\bar{x} = (0,1)^T$, where $\|x\|_2 = \|\bar{x}\|_2 < \|x - \bar{x}\|_2$. Theorem 3.2 gives an improved
bound on $\|x - \bar{x}\|_2$ by solving a linear program. The final and computationally important
results of this paper, contained in Theorems 3.7 and 3.8, are linearly and superlinearly
convergent schemes for determining the least 2-norm solution of a linear program. We give
the essence of these results. In solving very large sparse linear programs one solves by an
SOR technique [8,9,10] a quadratic perturbation (3.3) of the linear program (3.1) for a
"sufficiently small" value $\varepsilon$ of the perturbation parameter, that is $\varepsilon \in (0, \bar\varepsilon]$ for some $\bar\varepsilon > 0$.
Until now there was no simple way of determining when $\varepsilon \leq \bar\varepsilon$. Theorems 3.7 and 3.8 do
this as follows. Given a value $\varepsilon_i$ of the perturbation parameter, we approximately solve
the quadratic perturbation problem (3.3) for $x(\varepsilon_i)$ by an SOR or any other procedure to
a residual accuracy $r(\varepsilon_i)$ defined by (3.14). Then we decrease $\varepsilon_i$ to $\varepsilon_{i+1} = \mu\varepsilon_i$, $\mu \in (0,1)$,
and solve (3.3) to a residual accuracy $r(\varepsilon_{i+1})$ such that

(1.3)  $r(\varepsilon_{i+1}) \leq \nu\, r(\varepsilon_i)$ for some $\nu < \mu^{1/2}$ for linear convergence

or

(1.4)  $r(\varepsilon_i) \leq \varepsilon_i^{1/2}\, \xi\, \eta^{p^i}$ for some $\xi > 0$, $\eta \in (0,1)$, $p > 1$ for superlinear convergence


Theorem 3.7 shows that the sequence of approximate solutions $\{x(\varepsilon_i)\}$ thus generated
converges to the unique least 2-norm solution of the linear program (3.1) at a linear rate
under (1.3), while Theorem 3.8 establishes $p$-rate superlinear convergence under (1.4).
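
The two residual rules (1.3) and (1.4) can be tabulated in a few lines. The following Python sketch is an illustration only and is not part of the original report: the function names and the idea of precomputing a schedule are assumptions made here. In an actual run, $r(\varepsilon_0)$ would come from an approximate solve, and under (1.3) each target is an upper limit relative to the residual achieved at the previous step.

```python
import math


def linear_schedule(eps0, r0, mu, nu, steps):
    """List of (eps_i, target_i) with eps_{i+1} = mu*eps_i and target_{i+1} = nu*target_i.

    Mirrors rule (1.3); requires 0 < mu < 1 and 0 < nu < sqrt(mu).
    """
    if not (0.0 < mu < 1.0 and 0.0 < nu < math.sqrt(mu)):
        raise ValueError("need 0 < mu < 1 and 0 < nu < sqrt(mu)")
    schedule = []
    eps, target = eps0, r0
    for _ in range(steps):
        schedule.append((eps, target))
        eps, target = mu * eps, nu * target
    return schedule


def superlinear_schedule(eps0, mu, xi, eta, p, steps):
    """List of (eps_i, target_i) with target_i = sqrt(eps_i) * xi * eta**(p**i).

    Mirrors rule (1.4); requires xi > 0, 0 < eta < 1 and p > 1.
    """
    if not (0.0 < mu < 1.0 and xi > 0.0 and 0.0 < eta < 1.0 and p > 1.0):
        raise ValueError("need 0 < mu < 1, xi > 0, 0 < eta < 1, p > 1")
    schedule = []
    for i in range(steps):
        eps = eps0 * mu ** i
        schedule.append((eps, math.sqrt(eps) * xi * eta ** (p ** i)))
    return schedule
```

With `mu = 0.25` and `nu = 0.4`, for instance, the perturbation parameter shrinks by a factor of 4 per step while the residual target shrinks by a factor of 2.5, which respects $\nu < \mu^{1/2} = 0.5$.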

We briefly describe now our notation and some basic concepts used. For a vector
$x$ in the n-dimensional real space $R^n$, $|x|$ and $x_+$ will denote the vectors in $R^n$ with
components $|x|_i = |x_i|$ and $(x_+)_i = \max\{x_i, 0\}$, $i = 1, \ldots, n$ respectively. For a norm
$\|x\|_\beta$ on $R^n$, the dual norm $\|x\|_{\beta^*}$ on $R^n$ will be defined by $\|x\|_{\beta^*} := \max_{\|y\|_\beta = 1} xy$, where
$xy$ denotes the scalar product. The generalized Cauchy-Schwarz inequality $|xy| \leq \|x\|_\beta \cdot \|y\|_{\beta^*}$,
for $x, y$ in $R^n$, follows immediately from this definition of the dual norm. For
$1 \leq p, q \leq \infty$ and $\frac{1}{p} + \frac{1}{q} = 1$, the p-norm $\left(\sum_{i=1}^n |x_i|^p\right)^{1/p}$ and the q-norm are dual norms
on $R^n$ [6]. If $\|\cdot\|_\beta$ is a norm on $R^n$, we shall, with a slight abuse of notation, let $\|\cdot\|_\beta$
also denote the corresponding norm on $R^m$ for $m \neq n$. $R^n_+$ will denote the nonnegative
orthant or the set of points in $R^n$ with nonnegative components, while $R^{m \times n}$ will denote
the set of all $m \times n$ real matrices. For $A \in R^{m \times n}$, $A^T$ will denote the transpose, $A_i$ will
in general denote the ith row, while $\|A\|_\beta$ will denote the matrix norm [1,13] subordinate
to the vector norm $\|\cdot\|_\beta$, that is $\|A\|_\beta := \max_{\|x\|_\beta = 1} \|Ax\|_\beta$. The consistency condition
$\|Ax\|_\beta \leq \|A\|_\beta \|x\|_\beta$ follows immediately from this definition of a matrix norm. We shall
also use $\|\cdot\|$ to denote an arbitrary vector norm and its subordinate matrix norm. For an
$x$ in $R^n$ we shall make use of some of the following norm-equivalence inequalities [19]

(1.5)  $\|x\|_\infty \leq \|x\|_2 \leq \|x\|_1 \leq \sqrt{n}\,\|x\|_2 \leq n\|x\|_\infty$


A vector of ones in $R^n$ for any integer $n$ will be denoted by $e$. For a differentiable function
$g: R^n \to R^m$, $\nabla g(x)$ will denote the $m \times n$ Jacobian matrix at $x$. Similarly for a
differentiable function $L(x,u): R^{n+m} \to R$, $\nabla_x L(x,u)$ will denote the n-dimensional
gradient vector with respect to $x$, while $\nabla_u L(x,u)$ will denote the m-dimensional gradient
vector with respect to $u$.
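
The notation above maps directly onto a few helper functions. The short Python sketch below is an illustration with hypothetical helper names (`plus`, `norm1`, `norm2`, `norm_inf`, `norm_chain`) that do not appear in the report. It evaluates the plus-part $x_+$ and the five quantities of the norm-equivalence chain (1.5) so that the ordering can be checked numerically for any vector.

```python
import math


def plus(x):
    """Componentwise positive part x_+ with (x_+)_i = max{x_i, 0}."""
    return [max(xi, 0.0) for xi in x]


def norm1(x):
    return sum(abs(xi) for xi in x)


def norm2(x):
    return math.sqrt(sum(xi * xi for xi in x))


def norm_inf(x):
    return max(abs(xi) for xi in x)


def norm_chain(x):
    """The five quantities of (1.5), in the order in which they are claimed to increase."""
    n = len(x)
    return [norm_inf(x), norm2(x), norm1(x), math.sqrt(n) * norm2(x), n * norm_inf(x)]
```

For `x = [3.0, -4.0]` the chain evaluates to 4, 5, 7, about 7.07 and 8, and `plus(x)` returns `[3.0, 0.0]`.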


2.  Error  Bounds  for  Strongly  Convex  Programs 

We first need a preliminary lemma which is essentially Lemma 2.1 of [11] for the case
when $f$ is strongly convex. Consider the dual of our nonlinear program (1.1) [7]

(2.1)  $\max_{(x,u)}\ L(x,u) - x\nabla_x L(x,u)$ subject to $(x,u) \in T := \{(x,u) \mid u \geq 0,\ \nabla_x L(x,u) \geq 0\}$

where $L(x,u)$ is the standard Lagrangian

$L(x,u) := f(x) + u g(x)$.

The Karush-Kuhn-Tucker (KKT) optimality conditions for (1.1) are [7]

(2.2)  $v = \nabla_x L(x,u) = \nabla f(x) + u\nabla g(x) \geq 0,\ \ x \geq 0,\ \ xv = 0,$
       $y = -\nabla_u L(x,u) = -g(x) \geq 0,\ \ u \geq 0,\ \ uy = 0$

If we make the definitions

(2.3)  $z := \begin{pmatrix} x \\ u \end{pmatrix}, \qquad F(z) := \begin{pmatrix} \nabla_x L(x,u) \\ -\nabla_u L(x,u) \end{pmatrix}$

then the Karush-Kuhn-Tucker conditions take on the equivalent complementarity
formulation [2]

(2.4)  $z \geq 0,\ \ w = F(z) \geq 0,\ \ zw = 0$

Our preliminary lemma establishes the strong monotonicity of the "twisted" derivative
$F(z)$ under the strong convexity of $f$ and convexity of $g$.


2.1 Lemma  Let $f$ and $g$ be differentiable on $R^n$, let $g$ be convex on $R^n$ and let
$f$ be strongly convex on $R^n$ with positive constant $k$. Then $F(z)$ as defined in (2.3) is
continuous and strongly monotone with respect to $x$ on $R^n \times R^m_+$, that is for all
$z := \begin{pmatrix} x \\ u \end{pmatrix}$ and $\bar{z} := \begin{pmatrix} \bar{x} \\ \bar{u} \end{pmatrix}$ in $R^n \times R^m_+$

(2.5)  $(z - \bar{z})(F(z) - F(\bar{z})) \geq k\|x - \bar{x}\|_2^2$

Proof  Just replace the last inequality of the proof of Lemma 2.1 of [11] by the inequality
of (1.2) above. ∎
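
Lemma 2.1 can be checked numerically on a small instance. The sketch below is an illustration under stated assumptions: it takes the quadratic objective $f(x) = cx + \frac{k}{2}xx$ and the linear constraint function $g(x) = b - Ax$, for which $F(z) = (c + kx - A^T u,\ Ax - b)$, and the function names are hypothetical. For linear $g$ the cross terms cancel, so (2.5) holds with equality and the returned gap should vanish up to rounding.

```python
def twisted_derivative(x, u, A, b, c, k):
    """F(z) of (2.3) for f(x) = c.x + (k/2) x.x and g(x) = b - A x (assumed example)."""
    m, n = len(A), len(x)
    grad_x = [c[j] + k * x[j] - sum(A[i][j] * u[i] for i in range(m)) for j in range(n)]
    minus_grad_u = [sum(A[i][j] * x[j] for j in range(n)) - b[i] for i in range(m)]
    return grad_x + minus_grad_u


def monotonicity_gap(x1, u1, x2, u2, A, b, c, k):
    """(z1 - z2).(F(z1) - F(z2)) - k*||x1 - x2||_2^2, which (2.5) says is nonnegative."""
    f1 = twisted_derivative(x1, u1, A, b, c, k)
    f2 = twisted_derivative(x2, u2, A, b, c, k)
    dz = [p - q for p, q in zip(x1 + u1, x2 + u2)]
    lhs = sum(d * (p - q) for d, p, q in zip(dz, f1, f2))
    rhs = k * sum((p - q) ** 2 for p, q in zip(x1, x2))
    return lhs - rhs
```

For a nonlinear convex $g$ the gap is in general strictly positive, since the convexity of $g$ contributes an extra nonnegative term.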


We  can  now  state  and  prove  two  error  bound  results. 

2.2 Theorem (Error bound in terms of KKT residuals)  Let $f: R^n \to R$, $g: R^n \to R^m$ be
differentiable on $R^n$, let $f$ be strongly convex on $R^n$ with positive constant $k$ and let $g$
be convex on $R^n$. Let either $g$ be linear and $S \neq \emptyset$, or let $g$ satisfy the Slater constraint
qualification, that is

(2.6)  $g(\hat{x}) < 0,\ \ \hat{x} > 0$

for some $\hat{x} \in R^n$. Then for any $(x,u) \in R^n \times R^m_+$ the distance $\|x - \bar{x}\|_2$ to the unique
solution $\bar{x}$ of (1.1) is bounded by

(2.7)  $\|x - \bar{x}\|_2 \leq k^{-1/2}\big[x\nabla_x L(x,u) - u g(x) + \alpha\|(-\nabla_x L(x,u))_+\|_1 + \beta\|(g(x))_+\|_\infty + \gamma\|(-x)_+\|_\infty\big]^{1/2}$

where

(2.8)  $\alpha := \min_{x \in S}\ \left(\|x\|_\infty + \|\nabla f(x)\|_1 / k\right)$

(2.9)  $\beta := \min_{(u,v) \in W}\ \|u\|_1$

(2.10)  $\gamma := \min_{(u,v) \in W}\ \|v\|_1$

where $W \subset R^{m+n}_+$ is the nonempty closed convex polyhedral set of optimal multipliers
$(u,v)$ of the convex program (1.1) associated with the constraints $g(x) \leq 0$, $x \geq 0$.

Proof  Since $S \neq \emptyset$ and $f$ is strongly convex, the program (1.1) has a unique solution
$\bar{x}$. Since either $g$ is linear or the Slater constraint qualification (2.6) is satisfied there exist
optimal Lagrange multipliers $(\bar{u},\bar{v}) \in R^{m+n}_+$ such that $(\bar{x},\bar{u},\bar{v})$ satisfy the KKT conditions
(2.2) [7] and hence the set $W$ of optimal Lagrange multipliers $(\bar{u},\bar{v})$ is nonempty, closed
and convex and in fact polyhedral here. Now for any $x \in S$ we have

$\|\nabla f(x)\|_1 \cdot \|x - \bar{x}\|_\infty \geq \nabla f(x)(x - \bar{x}) \geq (\nabla f(x) - \nabla f(\bar{x}))(x - \bar{x}) \geq k\|x - \bar{x}\|_2^2 \geq k\|x - \bar{x}\|_\infty^2$

where the second inequality follows from the minimum principle [7]. Hence

$\|\bar{x}\|_\infty - \|x\|_\infty \leq \|x - \bar{x}\|_\infty \leq \|\nabla f(x)\|_1 / k$

and since $x$ is an arbitrary point in $S$ it follows that

(2.11)  $\|\bar{x}\|_\infty \leq \min_{x \in S}\ \left(\|x\|_\infty + \|\nabla f(x)\|_1 / k\right) = \alpha$

where the minimum exists because of the continuity of the minimand on $R^n$ and the
compactness of its level sets. Now let $z := (x,u) \in R^n \times R^m_+$, $(\bar{u},\bar{v}) \in W$ and let $\bar{z} := (\bar{x},\bar{u})$.
Then by Lemma 2.1 we have that

$k\|x - \bar{x}\|_2^2 \leq (z - \bar{z})(F(z) - F(\bar{z}))$

$= zF(z) - \bar{z}F(z) - zF(\bar{z})$    (since $\bar{z}F(\bar{z}) = 0$)

$\leq zF(z) + \bar{z}(-F(z))_+ + \nabla_x L(\bar{x},\bar{u})(-x)_+$    (since $u g(\bar{x}) \leq 0$ and $\xi \leq \xi_+$)

$= x\nabla_x L(x,u) - u g(x) + \bar{x}(-\nabla_x L(x,u))_+ + \bar{u}(g(x))_+ + \bar{v}(-x)_+$

$\leq x\nabla_x L(x,u) - u g(x) + \|\bar{x}\|_\infty \cdot \|(-\nabla_x L(x,u))_+\|_1 + \|\bar{u}\|_1 \cdot \|(g(x))_+\|_\infty + \|\bar{v}\|_1 \cdot \|(-x)_+\|_\infty$

Since $(\bar{u},\bar{v})$ is an arbitrary point in $W$ it follows that in the last expression above, $\|\bar{u}\|_1$
and $\|\bar{v}\|_1$ can be replaced by their respective minima over $W$, while $\|\bar{x}\|_\infty$ can be replaced
by its upper bound $\alpha$ given by (2.11). Using the definitions (2.9) and (2.10) we have then

$k\|x - \bar{x}\|_2^2 \leq x\nabla_x L(x,u) - u g(x) + \alpha\|(-\nabla_x L(x,u))_+\|_1 + \beta\|(g(x))_+\|_\infty + \gamma\|(-x)_+\|_\infty$

from which (2.7) follows immediately. ∎
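
To see the bound (2.7) at work, consider a one-dimensional instance chosen here purely for illustration (it does not appear in the report): $f(x) = \frac{1}{2}(x-2)^2$, $g(x) = x - 1$, so $k = 1$, $S = [0,1]$ and $\bar{x} = 1$. The KKT conditions give the single multiplier pair $(\bar{u},\bar{v}) = (1,0)$, hence $\beta = 1$ and $\gamma = 0$, and $\alpha = \min_{x \in [0,1]} (|x| + |x - 2|) = 2$. The Python sketch below evaluates the right-hand side of (2.7) for any $x$ and any $u \geq 0$; the function names are hypothetical.

```python
import math

K = 1.0                               # strong convexity constant of f(x) = 0.5*(x - 2)**2
X_BAR = 1.0                           # unique solution of the illustrative program
ALPHA, BETA, GAMMA = 2.0, 1.0, 0.0    # constants (2.8)-(2.10), worked out by hand


def grad_f(x):
    return x - 2.0


def g(x):
    return x - 1.0


def grad_x_lagrangian(x, u):
    return grad_f(x) + u * 1.0        # the derivative of g is 1 for this example


def pos(t):
    return max(t, 0.0)


def kkt_error_bound(x, u):
    """Right-hand side of (2.7) for the one-dimensional example; requires u >= 0."""
    if u < 0.0:
        raise ValueError("u must be nonnegative")
    gl = grad_x_lagrangian(x, u)
    total = (x * gl - u * g(x)
             + ALPHA * pos(-gl) + BETA * pos(g(x)) + GAMMA * pos(-x))
    return math.sqrt(max(total, 0.0) / K)
```

At the infeasible point $x = 2$, $u = 0$ the bound equals 1, which is exactly the distance to $\bar{x} = 1$, and at the KKT point $(1,1)$ it equals 0.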

2.3 Remark  Note that the error bound of (2.7) is zero if and only if $x$ satisfies the
Karush-Kuhn-Tucker conditions (2.2) for some $u \in R^m_+$. In fact if we define a nonnegative
perturbation vector $p := (p_1, p_2, p_3, p_4)$ and define $x(p) \in R^n$ to be a solution of the
perturbed Karush-Kuhn-Tucker conditions (2.12)
for some $u$ in $R^m_+$, then $x(0) = \bar{x}$, the unique solution of (1.1). It follows then from (2.7)
that

(2.13)  $\|x(p) - x(0)\|_2 \leq \lambda \|p\|_1^{1/2}$

where

(2.14)  $\lambda := (\max\{1, \alpha, \beta, \gamma\} / k)^{1/2}$

The relation (2.13) shows that $x(p)$ is Lipschitzian of order $\frac{1}{2}$, with a Lipschitz constant
$\lambda$, at $p = 0$.

If the point $(x,u)$ of Theorem 2.2 is both primal and dual feasible, the bound (2.7)
of Theorem 2.2 simplifies considerably as indicated in the following.

2.4 Corollary (Error bound for primal-dual feasible points)  If in addition to the
assumptions of Theorem 2.2, $x$ is primal feasible and $(x,u)$ is dual feasible, that is $x \in S$
and $(x,u) \in T$, then

(2.15)  $\|x - \bar{x}\|_2 \leq \left(\left(x\nabla_x L(x,u) - u g(x)\right) / k\right)^{1/2}$

This corollary partially extends a result of [12, Equation 2.15] for error bounds for
positive semidefinite quadratic programs to strongly monotone convex programs. Pang
has given related error bounds for nonlinear complementarity problems [15] and linearly
constrained variational inequalities [16].

We note that the error bound of (2.7) contains 3 parameters $\alpha$, $\beta$ and $\gamma$ which
may not be easy to compute. These parameters can be replaced by bounds which are
more easily computable. In particular, if we let $x^0$ be any primal feasible point, and let $\hat{x}$
satisfy, in addition to the Slater constraint qualification (2.6), the dual feasibility condition
$(\hat{x},\hat{u}) \in T$ for some $\hat{u}$, then we have:

(2.16)  $\alpha \leq \alpha(x^0) := \|x^0\|_\infty + \|\nabla f(x^0)\|_1 / k$

(2.17)  $\beta \leq \beta(\hat{x},\hat{u}) := \left(\hat{x}\nabla_x L(\hat{x},\hat{u}) - \hat{u} g(\hat{x})\right) / \min_{1 \leq i \leq m} -g_i(\hat{x})$

(2.18)  $\gamma \leq \gamma(\hat{x},\hat{u}) := \left(\hat{x}\nabla_x L(\hat{x},\hat{u}) - \hat{u} g(\hat{x})\right) / \min_{1 \leq j \leq n} \hat{x}_j$

where the inequality of (2.16) follows immediately from the definition (2.8) of $\alpha$ and the
inequalities (2.17) and (2.18) from Theorem 2.2 of [11]. We therefore have the following.

2.5 Theorem (Explicit error bound in terms of KKT residuals)  Let the assumptions
of Theorem 2.2 hold including (2.6), let $x^0 \in S$ and let $(\hat{x},\hat{u}) \in T$ for some $\hat{u} \in R^m_+$.
Then for any $(x,u) \in R^n \times R^m_+$ the distance $\|x - \bar{x}\|_2$ to the unique solution $\bar{x}$ of (1.1) is
bounded by

(2.19)  $\|x - \bar{x}\|_2 \leq k^{-1/2}\big[x\nabla_x L(x,u) - u g(x) + \alpha(x^0)\|(-\nabla_x L(x,u))_+\|_1 + \beta(\hat{x},\hat{u})\|(g(x))_+\|_\infty + \gamma(\hat{x},\hat{u})\|(-x)_+\|_\infty\big]^{1/2}$

where $\alpha(x^0)$, $\beta(\hat{x},\hat{u})$ and $\gamma(\hat{x},\hat{u})$ are defined by (2.16)-(2.18).
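
The computable constants (2.16)-(2.18) can be illustrated on a one-dimensional instance chosen here as an assumption, not taken from the report: $f(x) = \frac{1}{2}(x-2)^2$, $g(x) = x - 1$, $k = 1$, for which the exact constants are $\alpha = 2$, $\beta = 1$, $\gamma = 0$. The Python sketch below uses hypothetical function names; with a single constraint and a single variable the minima in (2.17) and (2.18) reduce to $-g(\hat{x})$ and $\hat{x}$.

```python
K = 1.0   # strong convexity constant of f(x) = 0.5*(x - 2)**2


def grad_f(x):
    return x - 2.0


def g(x):
    return x - 1.0


def alpha_upper(x0):
    """alpha(x0) of (2.16); x0 must be primal feasible (0 <= x0 <= 1 here)."""
    return abs(x0) + abs(grad_f(x0)) / K


def gap_term(x_hat, u_hat):
    """x_hat * grad_x L(x_hat, u_hat) - u_hat * g(x_hat), the numerator in (2.17)-(2.18)."""
    return x_hat * (grad_f(x_hat) + u_hat) - u_hat * g(x_hat)


def beta_upper(x_hat, u_hat):
    """beta(x_hat, u_hat) of (2.17); needs g(x_hat) < 0 and dual feasibility."""
    return gap_term(x_hat, u_hat) / (-g(x_hat))


def gamma_upper(x_hat, u_hat):
    """gamma(x_hat, u_hat) of (2.18); needs x_hat > 0 and dual feasibility."""
    return gap_term(x_hat, u_hat) / x_hat
```

With $x^0 = \hat{x} = 0.5$ and $\hat{u} = 1.5$ (so that $\nabla_x L(\hat{x},\hat{u}) = 0 \geq 0$), the three values are 2, 1.5 and 1.5, each at least as large as the exact constant it replaces.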

2.6 Remark  We note that for a fixed $x$, $x^0$, $\hat{x}$ and $\hat{u}$, the choice of $u$ in the bound
(2.19) can be optimized by solving the following linear program in order to obtain the best
bound on $\|x - \bar{x}\|_2$:

(2.20)  $\min_{(u,s) \in R^{m+n}}\ u(\nabla g(x)x - g(x)) + \alpha(x^0)\, es$
        subject to $-\nabla f(x) - u\nabla g(x) \leq s,\ \ u, s \geq 0$

Under the assumptions of Theorem 2.5, the objective function of the feasible linear program
(2.20) is bounded below and hence is solvable. Any solution $(u,s)$ of (2.20) will provide
an optimal $u$ which will give the best bound in (2.19) for the given fixed $x$, $x^0$, $\hat{x}$ and $\hat{u}$.


3.  Application  to  Least  2-Norm  Solution  for  Linear  Programs 

In this section we use the error bound for strongly convex programs to derive two
simple bounds (Theorems 3.1 and 3.2) for the least 2-norm solution of a linear program in
terms of any other solution of the linear program. More importantly we give in Theorems
3.7 and 3.8 linearly and superlinearly convergent iterative procedures for determining the
least 2-norm solution of a linear program. The proposed schemes should be very helpful
in precisely determining the manner in which the perturbation parameter $\varepsilon$ and its
corresponding error residual $r(\varepsilon)$ (3.14) should be decreased in the highly effective successive
overrelaxation methods for solving very large sparse programs [8,9,10].

We consider the linear program

(3.1)  $\min_x\ cx$ subject to $Ax \geq b,\ x \geq 0$

where $c \in R^n$, $b \in R^m$ and $A \in R^{m \times n}$, and its dual

(3.2)  $\max_u\ bu$ subject to $A^T u \leq c,\ u \geq 0$

It is known [9,10] that $\bar{x}$ is the unique least 2-norm solution to (3.1) if and only if $\bar{x}$ is the
unique solution to the quadratic program

(3.3)  $\min_x\ cx + \frac{\varepsilon}{2}xx$ subject to $Ax \geq b,\ x \geq 0$

for all $\varepsilon \in (0, \bar\varepsilon]$ for some $\bar\varepsilon > 0$. The dual to the quadratic program (3.3) is [7]

(3.4)  $\max_{(x,u,v)}\ -\frac{\varepsilon}{2}xx + bu$ subject to $v = \varepsilon x - A^T u + c,\ (u,v) \geq 0$

Now if $(x,u)$ is an arbitrary optimal point for the dual linear programs (3.1)-(3.2),
then for any $\varepsilon \in (0, \bar\varepsilon]$, the point $(x, u, v := \varepsilon x - A^T u + c)$ is feasible for the dual quadratic
programs (3.3)-(3.4) and hence by Corollary 2.4

(3.5)  $\|x - \bar{x}\|_2 \leq \left(\frac{cx + \varepsilon xx - bu}{\varepsilon}\right)^{1/2} = \|x\|_2$

where $\bar{x}$ is the least 2-norm solution of the linear program (3.1). Hence we have established
the following.


3.1 Theorem (Bound for the distance between an LP solution and the least 2-norm LP
solution)  For the linear program (3.1)

(3.6)  $\|x - \bar{x}\|_2 \leq \|x\|_2$

where $x$ is any optimal solution to (3.1) and $\bar{x}$ is the unique optimal solution to (3.1) with
least 2-norm.
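
The step from Corollary 2.4 to the bound $\|x - \bar{x}\|_2 \leq \|x\|_2$ can be written out explicitly. The derivation below is a worked restatement under the identifications $f(x) = cx + \frac{\varepsilon}{2}xx$, $g(x) = b - Ax$ and $k = \varepsilon$ for the quadratic program (3.3); it uses only the linear programming duality relation $cx = bu$ for an optimal pair $(x,u)$.

```latex
\begin{align*}
(\nabla f(y) - \nabla f(x))(y - x) &= \varepsilon\,\|y - x\|_2^2
  \quad\Longrightarrow\quad k = \varepsilon, \\
\nabla_x L(x,u) &= c + \varepsilon x - A^T u, \\
x\nabla_x L(x,u) - u\,g(x) &= cx + \varepsilon\,xx - uAx - u(b - Ax)
  = cx + \varepsilon\,xx - bu, \\
\|x - \bar{x}\|_2 &\le \left(\frac{cx + \varepsilon\,xx - bu}{\varepsilon}\right)^{1/2}
  = (xx)^{1/2} = \|x\|_2 \qquad \text{since } cx = bu .
\end{align*}
```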

We can improve on the bound (3.6) if instead of using $u$ which is a solution of the
linear program (3.2) we use $u(x,\varepsilon)$, which minimizes the bound of (2.15) for the given
linear program solution $x$, and such that $(x, u(x,\varepsilon))$ is feasible for the dual (3.4) of the
quadratic program (3.3). Hence we take $u(x,\varepsilon)$ as a solution of the linear program

(3.7)  $\max_u\ bu$ subject to $A^T u \leq c + \varepsilon x,\ u \geq 0$

This linear program is solvable because it is feasible (its feasible region contains that of
(3.2)) and its objective function is bounded above by $(c + \varepsilon x)x$. Hence $bu(x,\varepsilon) \geq bu = cx$
and the bound (3.5) is improved as follows

(3.8)  $\|x - \bar{x}\|_2 \leq \left(xx - \frac{bu(x,\varepsilon) - cx}{\varepsilon}\right)^{1/2} \leq \|x\|_2$

Since the bound of (3.8) is valid for all $\varepsilon \in (0, \bar\varepsilon]$ and the quotient $(bu(x,\varepsilon) - cx)/\varepsilon$ is a
bounded nonincreasing function of $\varepsilon$, we can take its limit as $\varepsilon \downarrow 0$. We summarize this
result in the following.


3.2 Theorem (Optimal bound for the distance between an LP solution and the least
2-norm LP solution)  For the linear program (3.1) the following bound holds where $x$ is
any optimal solution to (3.1), $\bar{x}$ is the least 2-norm solution of (3.1) and $u(x,\varepsilon)$ is a solution
of the linear program (3.7):

(3.9)  $\|x - \bar{x}\|_2 \leq \lim_{\varepsilon \downarrow 0} \left(xx - \frac{bu(x,\varepsilon) - cx}{\varepsilon}\right)^{1/2}$


The following example illustrates the bounds (3.6) and (3.9).

3.3 Example  $\min x_2$ s.t. $-x_1 \geq -2,\ x_2 \geq 1,\ (x_1, x_2) \geq 0$

Problem (3.7) for this LP with $x = (1,1)^T$ is

(3.10)  $\max\ -2u_1 + u_2$ s.t. $-u_1 \leq \varepsilon,\ u_2 \leq 1 + \varepsilon,\ (u_1, u_2) \geq 0$

The primal solution set is $\bar{S} = \{x \in R^2 \mid 0 \leq x_1 \leq 2,\ x_2 = 1\}$ and the least 2-norm solution
is $\bar{x} = (0,1)^T$. We then have

$\|\bar{x}\|_2 = 1 = \|x - \bar{x}\|_2 \leq \|x\|_2 = \sqrt{2}$

which is the bound (3.6). The solution to (3.10) is $u(x,\varepsilon) = (0, 1+\varepsilon)^T$ and hence $bu(x,\varepsilon) = 1 + \varepsilon$
and the bound (3.9) gives

$1 = \|x - \bar{x}\|_2 \leq \lim_{\varepsilon \downarrow 0} \left(2 - \frac{1 + \varepsilon - 1}{\varepsilon}\right)^{1/2} = 1$

which is a sharp bound for this problem.
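
The numbers in Example 3.3 are easy to reproduce. The Python sketch below is an illustration with hypothetical function names: it hard-codes the data $c = (0,1)$, $b = (-2,1)$, $x = (1,1)$ and uses the closed-form solution $u(x,\varepsilon) = (0, 1+\varepsilon)$ of (3.10) instead of calling a linear programming solver, then evaluates the simple bound (3.6) and the bound (3.8) for a given $\varepsilon$.

```python
import math

C = [0.0, 1.0]        # cost vector of Example 3.3
B = [-2.0, 1.0]       # right-hand side of Ax >= b
X = [1.0, 1.0]        # an LP solution that is not of least 2-norm
X_BAR = [0.0, 1.0]    # the least 2-norm solution


def dot(a, b):
    return sum(p * q for p, q in zip(a, b))


def u_of_eps(eps):
    """Closed-form solution of (3.10): the objective decreases in u1 and increases in u2."""
    return [0.0, 1.0 + eps * X[1]]


def simple_bound():
    """Bound (3.6): the 2-norm of the LP solution."""
    return math.sqrt(dot(X, X))


def improved_bound(eps):
    """Bound (3.8): (xx - (b.u(x,eps) - c.x)/eps)^(1/2)."""
    u = u_of_eps(eps)
    return math.sqrt(dot(X, X) - (dot(B, u) - dot(C, X)) / eps)
```

For this problem `improved_bound(eps)` is 1 (up to rounding) for every positive `eps`, matching the true distance, while `simple_bound()` returns $\sqrt{2}$.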


3.4 Remark  We note that under certain assumptions, such as the strong second order
sufficient optimality condition and linear independence of the gradients of the active
constraints [4, p.44], the function $bu(x,\varepsilon)$ is differentiable with respect to $\varepsilon$ at $\varepsilon = 0$
and $\frac{d}{d\varepsilon}(bu(x,\varepsilon))\big|_{\varepsilon=0} = xx$. For such a case the bound (3.9) degenerates to (since $cx = bu(x,0)$)

$\|x - \bar{x}\|_2 \leq \left(xx - \frac{d}{d\varepsilon}(bu(x,\varepsilon))\Big|_{\varepsilon=0}\right)^{1/2} = 0$

and hence $x = \bar{x}$, which of course is the consequence of the second order sufficient
optimality condition which implies that $x$ is a locally and hence globally unique solution of
the linear program (3.1).

We conclude by giving linearly and superlinearly convergent procedures for obtaining
the least 2-norm solution of the linear program (3.1) based on the error bound (2.7).
These procedures should be very useful in the successive overrelaxation (SOR) procedure
for solving (3.3) [8,9]. The usefulness comes in determining a method for cutting the size
of the parameter $\varepsilon$ in (3.3) and the accuracy to which (3.3) is solved for each $\varepsilon$. This
results in a precise scheme that drives $\varepsilon$ below the value $\bar\varepsilon$, which in general is unknown
and very difficult to compute. We first outline how the proposed procedure is applied. To
solve (3.3) for a fixed $\varepsilon$, we apply an SOR procedure [8,9] or any other procedure to its
dual (3.4) with the variable $x$ eliminated through the dual constraint

(3.11)  $x = \frac{1}{\varepsilon}(A^T u + v - c)$

and thus obtaining the dual problem

(3.12)  $\min_{(u,v) \geq 0}\ \theta(u,v) := \min_{(u,v) \geq 0}\ \frac{1}{2}\|A^T u + v - c\|_2^2 - \varepsilon bu$

which would have to be solved for a sufficiently small $\varepsilon \in (0, \bar\varepsilon]$. Since we do not know a
priori how small $\varepsilon$ need be, we consequently need to solve (3.12) for a decreasing sequence
of $\varepsilon$ values. If an iterative procedure such as SOR is used to solve (3.12), as in the case of
very large sparse linear programs [8,9], we would have a procedure with an infinite inner
loop. Our present proposed approach now eliminates the need to solve (3.12) exactly and
consists of solving (3.12) only to an explicit finite accuracy after which $\varepsilon$ is decreased
sufficiently to generate a linear or superlinear rate of convergence of the overall procedure.
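
An inner solver for (3.12) can be sketched as a projected, SOR-type coordinate sweep. The Python code below is a minimal illustration under stated assumptions and is not the SOR method of [8,9], whose details are not given here: it stacks $w = (u,v)$, writes $M = [A^T\ \ I]$, and updates one coordinate at a time by a relaxed Newton step projected onto $w_j \geq 0$. The function name, the relaxation parameter `omega`, the stopping rule and the dense list-of-lists storage are all choices made for this sketch.

```python
def solve_dual_psor(A, b, c, eps, omega=1.0, sweeps=200, tol=1e-12):
    """Projected SOR-type sweeps on theta(u,v) = 0.5*||A^T u + v - c||^2 - eps*b.u, (u,v) >= 0.

    Returns (u, v, x) with x recovered from the dual constraint x = (A^T u + v - c)/eps.
    """
    m, n = len(A), len(c)
    # Column of M = [A^T  I] belonging to u_i is row i of A; for v_t it is the unit vector e_t.
    cols = [[A[i][j] for j in range(n)] for i in range(m)]
    cols += [[1.0 if j == t else 0.0 for j in range(n)] for t in range(n)]
    q = list(b) + [0.0] * n              # linear term: eps * (b, 0) . w
    w = [0.0] * (m + n)
    resid = [-cj for cj in c]            # M w - c for the starting point w = 0
    for _ in range(sweeps):
        largest_step = 0.0
        for j, col in enumerate(cols):
            diag = sum(t * t for t in col)
            if diag == 0.0:
                continue                 # zero column: this sketch leaves w_j unchanged
            grad_j = sum(col[t] * resid[t] for t in range(n)) - eps * q[j]
            new_wj = max(0.0, w[j] - omega * grad_j / diag)
            step = new_wj - w[j]
            if step != 0.0:
                for t in range(n):       # keep resid = M w - c up to date
                    resid[t] += step * col[t]
                w[j] = new_wj
                largest_step = max(largest_step, abs(step))
        if largest_step <= tol:
            break
    x = [r / eps for r in resid]
    return w[:m], w[m:], x
```

On the data of Example 3.3 with `eps = 0.5` and the default `omega = 1.0`, the sweep reaches $u = (0, 1.5)$, $v = (0,0)$ and $x = (0,1)$, the least 2-norm solution. In the scheme described above one would stop the sweeps once the residual target for the current $\varepsilon$ is met rather than iterate to a fixed tolerance.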

To define our procedures we need to define approximate and exact solutions to (3.12)
and (3.3). For that purpose we first give the necessary and sufficient Karush-Kuhn-Tucker
optimality conditions for (3.12):

(3.13)  (a)  $\nabla_u \theta(u,v) = A(A^T u + v - c) - \varepsilon b \geq 0$
        (b)  $u \nabla_u \theta(u,v) = 0$
        (c)  $u \geq 0$
        (d)  $\nabla_v \theta(u,v) = A^T u + v - c \geq 0$
        (e)  $v \nabla_v \theta(u,v) = 0$
        (f)  $v \geq 0$

Now  we  make  the  following  definitions. 

3.5 Definition (Exact solutions to (3.12) & (3.3))  For a fixed positive $\varepsilon$ an exact solution
to the dual quadratic program (3.12) is designated by $(\bar{u}(\varepsilon), \bar{v}(\varepsilon))$ and hence must satisfy
(3.13). The corresponding $\bar{x}(\varepsilon)$ in $R^n$ defined by (3.11) with $(u,v) = (\bar{u}(\varepsilon), \bar{v}(\varepsilon))$ is an
exact solution to the quadratic program (3.3). The set of all $(\bar{u}(\varepsilon), \bar{v}(\varepsilon))$ which are exact
solutions to (3.12) for a fixed positive $\varepsilon$ is designated by $W(\varepsilon)$.

3.6 Definition (Approximate solutions to (3.12) & (3.3))  For a fixed positive $\varepsilon$ any
point in $R^{m+n}_+$ is an approximate solution to the dual quadratic program (3.12) and
is designated by $(u(\varepsilon), v(\varepsilon))$. The corresponding $x(\varepsilon)$ in $R^n$ defined by (3.11) with
$(u,v) = (u(\varepsilon), v(\varepsilon))$ is an approximate solution to the quadratic program (3.3). The
residual $r(\varepsilon)$ associated with $(u(\varepsilon), v(\varepsilon), x(\varepsilon))$ is defined by

(3.14)  $r(\varepsilon) := \big[\,|x(\varepsilon)v(\varepsilon) + u(\varepsilon)(Ax(\varepsilon) - b)| + \|(b - Ax(\varepsilon))_+\|_\infty + \|(-x(\varepsilon))_+\|_\infty\big]^{1/2}$

Note that for an $\varepsilon > 0$ and an approximate solution $(u(\varepsilon), v(\varepsilon))$ to (3.12) and a
corresponding approximate solution $x(\varepsilon)$ to (3.3), $r(\varepsilon) = 0$ if and only if $(u(\varepsilon), v(\varepsilon)) \in W(\varepsilon)$
and $x(\varepsilon) = \bar{x}(\varepsilon)$. We also have that for $\varepsilon \in (0, \bar\varepsilon]$ for some $\bar\varepsilon > 0$, $\bar{x}(\varepsilon) = \bar{x}$, where
$\bar{x}$ is the least 2-norm solution of the linear program (3.1) [9,10].
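
The residual (3.14) is cheap to evaluate from any nonnegative pair $(u,v)$. The Python sketch below is an illustration with a hypothetical function name and dense list-of-lists storage for $A$: it first recovers $x(\varepsilon)$ from (3.11) and then assembles the three terms of (3.14), namely the complementarity term, the violation of $Ax \geq b$ and the violation of $x \geq 0$.

```python
import math


def residual(A, b, c, u, v, eps):
    """Return (r, x): the residual (3.14) and x = (A^T u + v - c)/eps from (3.11).

    Assumes u >= 0 and v >= 0, as required of an approximate solution.
    """
    m, n = len(A), len(c)
    x = [(sum(A[i][j] * u[i] for i in range(m)) + v[j] - c[j]) / eps for j in range(n)]
    slack = [sum(A[i][j] * x[j] for j in range(n)) - b[i] for i in range(m)]   # Ax - b
    complementarity = abs(sum(x[j] * v[j] for j in range(n))
                          + sum(u[i] * slack[i] for i in range(m)))
    primal_violation = max(max(-s, 0.0) for s in slack)    # ||(b - Ax)_+||_inf
    sign_violation = max(max(-xj, 0.0) for xj in x)        # ||(-x)_+||_inf
    return math.sqrt(complementarity + primal_violation + sign_violation), x
```

On the data of Example 3.3 with $\varepsilon = 0.5$, the exact dual solution $u = (0, 1.5)$, $v = (0,0)$ gives $r = 0$ and $x = (0,1)$, while the inexact pair $u = (0, 1)$, $v = (0,0)$ gives $x = (0,0)$ and $r = \sqrt{2}$.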

We are prepared now to state and prove our linearly and superlinearly convergent
procedures for computing the least 2-norm solution of the linear program (3.1) and we
begin with the former.

3.7 Theorem (Linearly convergent procedure for least 2-norm solution of a linear
program)  Assume that the linear program (3.1) is solvable and that $b \neq 0$. Let $\{\varepsilon_0, \varepsilon_1, \ldots\}$
be a decreasing sequence of positive numbers such that

(3.15)  $\varepsilon_{i+1} = \mu\varepsilon_i$ for some $\mu \in (0,1)$

and let $\{u(\varepsilon_i), v(\varepsilon_i), x(\varepsilon_i)\}$ be a corresponding sequence of approximate solutions to
(3.12) and (3.3) satisfying Definition 3.6 and such that their residuals as defined by (3.14)
satisfy

(3.16)  $r(\varepsilon_{i+1}) \leq \nu\, r(\varepsilon_i)$

for some $\nu > 0$ and such that

(3.17)  $\nu < \mu^{1/2}$

Then the sequence $\{x(\varepsilon_i)\}$ converges to $\bar{x}$, the least 2-norm solution of the linear program
(3.1), at the linear root-rate [14]

(3.18)  $\|x(\varepsilon_i) - \bar{x}\|_2 \leq \delta\,(\nu/\mu^{1/2})^i$ for $i \geq \bar{i}$

for some constant $\delta$ and some integer $\bar{i}$.


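Proof  By Theorem 2.2 we have

(3.19)  $\|x(\varepsilon_i) - \bar{x}(\varepsilon_i)\|_2 \leq \varepsilon_i^{-1/2}\big[\,|x(\varepsilon_i)v(\varepsilon_i) + u(\varepsilon_i)(Ax(\varepsilon_i) - b)| + \beta(\varepsilon_i)\|(b - Ax(\varepsilon_i))_+\|_\infty + \gamma(\varepsilon_i)\|(-x(\varepsilon_i))_+\|_\infty\big]^{1/2}$

where

(3.20a)  $\beta(\varepsilon_i) := \min_{(u,v) \in W(\varepsilon_i)} \|u\|_1 = \min_{(u,v) \geq 0} \left\{ eu \ \middle|\ A^T u + v = \varepsilon_i\bar{x}(\varepsilon_i) + c,\ \ bu = c\bar{x}(\varepsilon_i) + \varepsilon_i\bar{x}(\varepsilon_i)\bar{x}(\varepsilon_i)/2 \right\}$

(3.20b)  $\gamma(\varepsilon_i) := \min_{(u,v) \in W(\varepsilon_i)} \|v\|_1 = \min_{(u,v) \geq 0} \left\{ ev \ \middle|\ A^T u + v = \varepsilon_i\bar{x}(\varepsilon_i) + c,\ \ bu = c\bar{x}(\varepsilon_i) + \varepsilon_i\bar{x}(\varepsilon_i)\bar{x}(\varepsilon_i)/2 \right\}$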


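By the fundamental theorem for the existence of basic feasible solutions for linear
programs [5, Theorem 3.3], it follows that for each $\varepsilon_i$, there exist basis matrices
$B_1(\varepsilon_i)$, $B_2(\varepsilon_i)$, that is $(n+1) \times (n+1)$ nonsingular submatrices of $\begin{pmatrix} A^T & I \\ b & 0 \end{pmatrix}$, such that

(3.21a)  $\beta(\varepsilon_i) = (e\ \ 0)\, B_1(\varepsilon_i)^{-1} \begin{pmatrix} \varepsilon_i\bar{x}(\varepsilon_i) + c \\ c\bar{x}(\varepsilon_i) + \varepsilon_i\bar{x}(\varepsilon_i)\bar{x}(\varepsilon_i)/2 \end{pmatrix}$

(3.21b)  $\gamma(\varepsilon_i) = (0\ \ e)\, B_2(\varepsilon_i)^{-1} \begin{pmatrix} \varepsilon_i\bar{x}(\varepsilon_i) + c \\ c\bar{x}(\varepsilon_i) + \varepsilon_i\bar{x}(\varepsilon_i)\bar{x}(\varepsilon_i)/2 \end{pmatrix}$

Since there are only a finite number of basis matrices in $\begin{pmatrix} A^T & I \\ b & 0 \end{pmatrix}$, we have that upon
taking $B$ as that basis matrix with largest $\|B^{-1}\|_1$,

(3.22)  $\beta(\varepsilon_i)$ or $\gamma(\varepsilon_i) \leq \|B^{-1}\|_1 \cdot \left\| \begin{pmatrix} \varepsilon_i\bar{x}(\varepsilon_i) + c \\ c\bar{x}(\varepsilon_i) + \varepsilon_i\bar{x}(\varepsilon_i)\bar{x}(\varepsilon_i)/2 \end{pmatrix} \right\|_1 = \|B^{-1}\|_1 \cdot \left( \|\varepsilon_i\bar{x}(\varepsilon_i) + c\|_1 + |c\bar{x}(\varepsilon_i) + \varepsilon_i\bar{x}(\varepsilon_i)\bar{x}(\varepsilon_i)/2| \right)$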


(3.23) 


x(£,)  -  x  for  e  €  (0,e] 


where  x  is  the  unique  least  2-norm  solution  of  the  linear  program  (3.1);  and  for  r , 
have  from  (2.11)  that 


(3.24) 


|x(cra)||i  <  n||x(£t):|uv,  •;  n  min  { ||x||oo 


■+  cjj  1 1 


t  -  -  '  £  \Ax  ■  b .  x  •  0} 


•  ill  ■  e,.  *  :i  +  I|cili  I  4  ,  -  n, 

v  n  mm  (I'x  i,v  •+  -  -  -  -  -  (.-lx  •  b.  x  j-  0)  :t 


w- 


Using  (3.23)  and  (3.24)  in  (3.22)  gives 


(3.25) 


/%i)  or  7(es)  <  j|fl  1 1| i  [max  {e|j£||i  +  ||c|j,,  e, 0r  +  ||cj|,} 


+  max  {|ex|  +  exx/2,  ||c||oor  +  £o7'2/2}]  =: 0 


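Hence combining (3.19) and (3.25) gives

(3.26)  $\|x(\varepsilon_i) - \bar{x}(\varepsilon_i)\|_2 \leq \varepsilon_i^{-1/2}\, \sigma\, r(\varepsilon_i)$

where $r(\varepsilon_i)$ is the residual defined in (3.14) and

(3.27)  $\sigma := (\max\{1, \bar\beta\})^{1/2}$

From (3.15) we have that

(3.28)  $\varepsilon_i = \mu^i \varepsilon_0$

and from (3.16) we have that

(3.29)  $r(\varepsilon_i) \leq \nu^i\, r(\varepsilon_0)$, $i = 0, 1, \ldots$

Combining (3.26), (3.28) and (3.29) gives

(3.30)  $\|x(\varepsilon_i) - \bar{x}(\varepsilon_i)\|_2 \leq \sigma \varepsilon_0^{-1/2}\, r(\varepsilon_0)\, (\nu/\mu^{1/2})^i$

Observing that $\nu/\mu^{1/2} < 1$ from (3.17), and that $\bar{x}(\varepsilon_i) = \bar{x}$ for $\varepsilon_i \in (0, \bar\varepsilon]$, it follows that
$\lim_{i \to \infty} x(\varepsilon_i) = \bar{x}$. By defining

(3.31)  $\delta := \sigma \varepsilon_0^{-1/2}\, r(\varepsilon_0)$

and $\bar{i}$ as the smallest integer such that $\varepsilon_{\bar{i}} \leq \bar\varepsilon$, we have from (3.30) and the fact that
$\bar{x}(\varepsilon_i) = \bar{x}$ for $\varepsilon_i \leq \bar\varepsilon$, that for $i \geq \bar{i}$

(3.32)  $\|x(\varepsilon_i) - \bar{x}\|_2 \leq \|x(\varepsilon_i) - \bar{x}(\varepsilon_i)\|_2 + \|\bar{x}(\varepsilon_i) - \bar{x}\|_2 = \|x(\varepsilon_i) - \bar{x}(\varepsilon_i)\|_2 \leq \delta\,(\nu/\mu^{1/2})^i$

which establishes (3.18). ∎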


We finally note that a superlinear root-rate of convergence [14] can be achieved in the
procedure of Theorem 3.7 if we cut the residual $r(\varepsilon_i)$ more sharply than that given by
(3.16)-(3.17). In particular we have the following.

3.8 Theorem (Superlinearly convergent procedure for least 2-norm solution of a linear
program)  Let the assumptions of Theorem 3.7 hold with (3.16) and (3.17) replaced by

(3.33)  $r(\varepsilon_i) \leq \varepsilon_i^{1/2}\, \xi\, \eta^{p^i}$

for some $\xi > 0$, $\eta \in (0,1)$ and $p > 1$. Then the sequence $\{x(\varepsilon_i)\}$ converges to $\bar{x}$, the
least 2-norm solution of the linear program (3.1), at the superlinear root-rate of

(3.34)  $\|x(\varepsilon_i) - \bar{x}\|_2 \leq \sigma\xi\eta^{p^i}$ for $i \geq \bar{i}$

for some integer $\bar{i}$ and $\sigma$ defined by (3.27).

Proof  From (3.26) and (3.33) we obtain

(3.35)  $\|x(\varepsilon_i) - \bar{x}(\varepsilon_i)\|_2 \leq \sigma\xi\eta^{p^i}$

Since $\bar{x}(\varepsilon_i) = \bar{x}$ for $i \geq \bar{i}$ for some $\bar{i}$, (3.34) follows from (3.35), and since $\eta^{p^i} \to 0$,
$x(\varepsilon_i) \to \bar{x}$. ∎